.NSECP(Some Assumptions about Discovery)

Ideally, a scientific thesis should state its assumptions at the beginning
and its conclusions at the end. 
In Artificial Intelligence theses, one or both may be omitted entirely.
Somewhat recklessly, I've tried to combine those two chapters into one,
since both are easily separated from the rest of the project.

Section 1 bares the underlying basis of AM: all the previously-hidden
assumptions about what constitutes "doing math research". 
First, let me
"admit" the existence of such a basis of beliefs, an interlocking set of
constraints which are (i) believed by the author, (ii) quietly interwoven 
throughout AM, and yet (iii) can't be readily tested by AM.
Much of this foundation deals with what to ⊗4ignore⊗*: incubation, the subconscious,
human drives and taboos, the political climate, and factors which I'm not even aware
of ignoring.
The rest of the basis is more positive. In this section, a detailed model of
-- perhaps a recipe for -- making math discoveries will be expounded. The reader
will see that AM incorporates many of the details, in all-encompassing ways which
can't be separated from the rest of the system and examined.
For example, there is the postulate that heuristics are useful both for
suggesting new things to consider and for keeping the space of plausible items
pruned to a reasonable size. By the very manner in which heuristic rules are
coded and used, these two functions are blended together in AM.
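
To make that postulate concrete, here is a minimal sketch (in Python, with
invented names and thresholds; AM's own rules are not coded this way) of a single
heuristic rule whose one action both suggests new candidates and, by rating them,
prunes away the ones not worth keeping:
.B
# A minimal sketch: one heuristic rule that both suggests and prunes.
# Everything here (worth, agenda, the 400/500 thresholds) is invented.

from dataclasses import dataclass

@dataclass
class Concept:
    name: str
    worth: int                       # crude "interestingness" rating, 0..1000

agenda: list[tuple[int, str]] = []   # (rating, task description)

def heuristic_specialize(c: Concept) -> None:
    """IF c has proven interesting, THEN suggest some specializations of c."""
    if c.worth < 500:                # the suggesting rule is also a pruning rule:
        return                       # dull concepts trigger no suggestions at all
    for suffix in ("with-identity", "restricted-to-pairs"):
        spec = Concept(f"{c.name}-{suffix}", worth=int(0.8 * c.worth))
        if spec.worth >= 400:        # weak candidates never even reach the agenda
            agenda.append((spec.worth, f"fill in examples of {spec.name}"))

heuristic_specialize(Concept("Composition", worth=700))
print(sorted(agenda, reverse=True))  # only the plausible suggestions survive
.E
The point is only that suggestion and pruning live in the same rule; there is no
separate "pruner" one could point to and remove.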


What is the implicit
⊗4model⊗* of scientific discovery that this project assumes?  
This is a more general question, and is discussed in Section 2 of this chapter.
Some of the results of the last chapter shed a little light on some of these
assumptions; where they do, that evidence will be blended into the discussion of the
model.

If there really is a general recipe for constructing theory-formers, what is it?
What kinds of domains are ripe for attack, and what features make a domain
impregnable to AM's variety of exploration? This high-level discussion of
the uses and limitations of AM-like systems will form the last section of this
chapter.



.SSEC(Designing Math Theorizers)

Theoretical issues, constraints. Model of math research.
 This is "one way things could be done...". It is neither the sole
 "correct" solution, nor is every nuance captured in the AM system itself.

a) Research in various domains of sci., and of math, proceeds slightly differently
	Some examples of this...

b) If we want a system to work in many domains, we'd have to sacrifice some power.

c) This brings up the choice of domain. 
	Why math? Why elementary set theory, instead of a more sophisticated field?
	What else could it be or not be? Why not a simpler field, like logic?

d) Detailed model of math research
The "raw materials" for AM's power may derive from the fact that it was based
on a detailed (and at least plausible) model for math research,
based on writings by Polya, Poincare, Hadamard, etc.

What are the peculiarities, the details of MATH research, that provide AM's
power? Why would it be weaker if not specific to math?

e) Implications for an automated mathematician
What constraints, design features, are required/suggested by the
domain-specific features of the model of math research?
(All the rest must therefore be random implementation hacks.)

.SSEC(Scientific Discovery as Heuristic Search)

Similar to above, but generally applicable to any simulatable empirical science.
View AM as providing a suggestive model for how to emulate a researcher.
This model is of course a (modified) heuristic search paradigm.

In other words, now consider what constraints/suggestions arise simply from
deciding that we want a system which will do scientific theory formation.
This will isolate them from those that were really specific to mathematics.
While the following material may all be included, its order may be changed.

.B APART

a) Scientific discovery as heuristic search: evolution of the model
	Each node in the space corresponds to a concept
		Concepts can be static (Sets) or Active (Composition)
		A relnship. (e.g., a theorem) is itself a concept
		An argument (e.g., a proof) is also considered a concept
	The "legal" moves (ways to expand the tree of nodes, to grow new ones)
		are too numerous to be seriously considered.
        The real operators are themselves heuristic rules of thumb
	So the "space" itself weeds out all nodes except those proposed for some
		good heuristic reason.
	As simple numerical calculations show, this space is still enormous.
	Using the big-switch idea, we can restrict our attention to one domain
		(e.g., math) and only use ITS nodes and heur. operators.
	Even so, the space is still too big
	Refine the big-switch idea: 
		When worrying about nodes N1,...,Nk, only consider heuristic
			operators known to be rele. to those concepts.
	Even so, the space is still too big
	We recurse: we use heuristics to reduce our search. These new meta-heuristics
		(strategies) guide our attention (which nodes to look at next,
		which operators are most promising to apply to each selected node).
	If these are good enough, then we are through (else consider meta-meta-heurs.)
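	(Purely illustrative arithmetic -- every number below is invented, nothing
	 is measured from AM -- showing why each successive restriction above is needed:)

# How many concepts are reachable within 5 steps under each regime?
restrictions = [
    ("any legal move",        1000),  # every syntactically legal way to grow a node
    ("one domain only",        200),  # the big switch: just math's operators
    ("relevant operators",      20),  # only operators known rele. to the node at hand
    ("meta-heuristic filter",    2),  # the few moves a strategy rates worth trying now
]
for label, branching in restrictions:
    print(f"{label:25s} ~{branching ** 5:>20,} concepts within 5 steps")
# Only the last figure (32) is a search one could actually carry out; even
# "relevant operators" leaves ~3,200,000 nodes, hence the recursion onto
# meta-heuristics (and, if need be, meta-meta-heuristics).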

b) The model finally produced
	Recapping, we state explicitly how we decide what to do next at any moment.
	Indicate the data structures, the flavors of heuristics, control flow, etc.
			Note that "apply heur. operator F and add corresponding
			new node N" is considered primitive for the moment.

c) Evidence in favor of (empirical validation of) that simple model
	i) A prediction: Character of interdisciplinary research
	   For a novice in some field to do new research, he must learn the rele.
		already-known concepts, and (probably) must learn the rele. heurs.
		(e.g.: bubble-chamber physics experiments, molec. genetics)
	   On the other hand, if he has expertise in another field, he brings with
		him many new heurs. to apply (hopefully a couple carry over and were
		never applied before in this new field), plus he brings with him the
		knowledge of a new net of concepts, from which he may draw analogies
	   Because of this, interdisciplinary research can be very productive, 
		especially if you're the first such link (e.g.: Suppes)

	ii) Turn the model upside-down: Analyzing a given discovery
	   When we hear a new discovery, we try to perceive a path backwards from
		it, connecting to concepts we already know. The easier this is,
		the less mystified we are by the discovery. 
		Consequence: Discoveries in alien field seem magical and v. hard
		Consequence: Let-down after seeing how a magic trick is performed
		Consequence: Let-down after seeing how an AI program really works.
	   Reasons why going backwards is probably easier than going forwards
		Easier since you have a given starting point (the given discovery)
			The alternative is to find the "right" set of known concepts
			which will eventually lead to some interesting new concept.
		Easier since the target space is huge (get to any known concept,
			vs. get to any very interesting concept)
			Since nature is unkind, very few avenues lead to 
			interesting new discoveries or valuable new concepts.
			On the other hand, in working backwards with our heurs,
			we worry about branching also, and must be able to quickly
			tell if we are really simplifying the situation!
			(i.e., Even if the model is right, we must herein worry
			 about the branching factor when going in reverse)
		Easier if you are shown the discovery step by step
			Since then you will know the necessary intermediate concepts
			Since then each step will seem "easy" and obvious
		Consequence: Reading journal article, feel "I could have done that"
		Consequence: Work 3 years, make discovery, kick yourself for not
			having seen it earlier, since it was so obvious.
		Consequence: given a math or physics problem, you're more impressed 
			with the solution if you spend (waste?) a few hours trying 
			to solve it, than if you just read thru the soln. at once.
	iii) "Failure" is due to missing some "right" heurs/concepts, or the wisdom
		(i.e., strategies, meta-heuristics, etc.) to use them effectively.
		Consequence: Teaching by example forces students to induce the 
			(meta-)heuristics themselves, often imperfectly.
			E.g.: many GP's failed the antibiotic drug prescription test
		Consequence: One can -- and should -- teach the strategies directly
			<is this verified anywhere? counter-indicated? Polya? >
	iv) A momentous occasion in science is typically due to the discovery of a brand
		new heuristic, or else to the creation of a new concept unconnected
		to the existing concepts by any plausible operator.
		Occasionally, all one finds is a daring interdisciplinary analogy.
		Occasionally, one is just the first to follow a perfectly plaus. path.
		Examples: 
		  Non-Euclidean geometries ("Counter-intuitive systems may
			still be consistent and interesting")
		  Relativity ("Counter-intu. sys. may even have physical
			reality; simultaneity may be a local superstition)
		  Knuth's (Conway's) Surreal Numbers (no contacts, not
			easy to see how/why their defn. was ever considered int.)
		  Schroedinger's wave equation (plucked out of thin air)
		  Methods of complex analysis (orig. based only on analogies)
		  Mendeleev's periodic chart (first one to write down the
			recently-discovered data about atomic wts. systematically)
		  < ... more examples ? ... >
	v) Presence of Zeitgeist in Science: often, a discovery is made simultaneously by
		several researchers, because many of them simult. hear some fresh concepts
		and apply the same bag of tricks. 
		Example: calculus (Newton, Leibniz), ...
			
d) Questioning that model
	i) Misleading character of polished results
		It would seem, from reading texts and journals, that science itself
		is more a flowing, smooth development than a backtracking search.
		This is illusory, as we scientists know. (we fend off this attack!)
	ii) Omissions:
		Serendipity, incubation, the unconscious, wealth of analogic materials
			(mass of introspections from great scientists are mystical)
		Error: accidental good fortune (Franklin using string, not wire;
			Lederberg picking one of the few bacteria that really do
			reproduce sexually; Galois forced to cram a lifetime of
			creativity into one pre-duel burst of writing).
		Inability to explain why "long-shot" investigations were undertaken.
			(eg: superconductivity discovered as a Master's project)
		Zeitgeist: popular trends/fads in science at the time;
		Cultural and political themes popular at the time.
			In CS: Hierarchy: Prussian army, bureaucracy
				Cooperating modular experts
				Tremendous dependence on existing hardware
				(also true of numerical analysis)
			Social taboos against experimenting with human subjects
		Difficulty of generating new operators and unconnected nodes
			Defense: this really IS rare, so again the model is OK
			Rebut: But it DOES happen every now and then!! How?!
		Focus of attention
			People are inherently biased in favor of recently-considered
			nodes; they can't flit back and forth from one leaf to another.
			This can be simulated in a heuristic search (by artificially
			weighting the concepts and heuristics based on recency of
			use), but it is not an inherent part of the model.
	iii) Getting down to earth: limitations of the model
		What kinds of creative activities is this a bad model for?
			Brainstorming (intentional lack of plausibility)
			Problem-solving (very specific goal)
			Situations where there are very few heuristics (prop. calc)
			Situations where the major difficulty is applying a heuristic
				once it's chosen (e.g., soft fields like sociology)
			Absurdly robust or delicate chains of discoveries
				A delicate chain is really like problem solving
				(there exists only 1 interesting concept in this
				part of the space)
				A very robust section of space needn't worry about
				using heuristics (if every move produces something
				very interesting)
		Exactly what situations does it honestly capture?
			Essentially undirected research in a very hard science,
			with only secondary kinds of discoveries anticipated.
			Many heuristic operators exist for any given node,
			and a few meta-heuristics exist for the domain.
			Where the percentage of int. nodes "out there" is low
			  (under 10%) but not so negligible that only one exists

		Pragmatic limitations
			Most seriously, it treats "add new node N" as primitive.
			In real life, people spend much time "adding a node".
			They have to answer many questions about it, play with it,
			and try to relate it to other known concepts. In this way,
			the worth of the node is estimated, and new empirical data
			is gathered which may trigger some heurs. to suggest the
			next direction to take, the next concept/relnship to explore.

e) Fixing up the model
	i) The most serious pragmatic limitation is this business about how much
		work must be spent to fill in any new concept. Fix this up by
		assuming that each concept has facets, not all of which need be 
		filled in at its conception. Then some heuristics (meta-heurs) can
		be concerned with tasks at the level of filling in a facet
		(deciding which facet of which concept to work on next). The basic
		control structure can in fact be oriented around filling in facets.
		If desired, the creation of new nodes can be a side effect!
		Give diagrams illustrating all this.
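		(A sketch, with invented names rather than AM's actual code, of the
		 repaired model: a concept is a bundle of mostly-empty facets, and the
		 unit of work on the agenda is "fill in facet F of concept C":)

from dataclasses import dataclass, field

FACETS = ("definition", "examples", "specializations", "conjectures", "worth")

@dataclass
class Concept:
    name: str
    facets: dict = field(default_factory=lambda: {f: None for f in FACETS})

@dataclass
class Task:
    rating: int                      # how promising this facet-filling job looks
    concept: Concept
    facet: str

def work_on(task: Task, concepts: list, agenda: list) -> None:
    """Fill in one facet; creating a new concept is only a possible side effect."""
    c, facet = task.concept, task.facet
    if facet == "specializations":
        new = Concept(c.name + "-special-case")         # the side-effect node
        c.facets[facet] = [new.name]
        concepts.append(new)
        agenda.append(Task(300, new, "examples"))       # propose further work on it
    else:
        c.facets[facet] = f"<{facet} of {c.name}, filled in>"   # stand-in for real heuristics

# Basic control structure: always take the highest-rated facet-filling task next.
concepts = [Concept("Sets")]
agenda = [Task(500, concepts[0], "examples"), Task(400, concepts[0], "specializations")]
while agenda:
    agenda.sort(key=lambda t: -t.rating)
    work_on(agenda.pop(0), concepts, agenda)
print([c.name for c in concepts])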

.E

.SSEC(Other Fields and Other Problems)

Why try to automate math/other sciences? 
What domains would be reasonable/bad? Why?

For some of the apropos domains, sketch how this could be done.